Abstract

Two bootstrapping or resampling strategies were investigated, as
applicable to estimating standard errors and ensuing confidence
intervals on variance components in two-factor random ANOVA
models. In light of prior negative findings regarding the
application of bootstrapping to this particular problem, a
recommendation of an "optimal" approach to resampling was sought.
The study used Monte Carlo simulations to test the variance
component estimation accuracy under simultaneous resampling of
all effect factors in a random model versus resampling a single
factor. The results indicated that single-factor resampling was
the preferable method and produced reasonable
estimates of both standard errors and confidence intervals
(parametric and non-parametric). Additional suggestions for
appropriate application of the technique are discussed.

Bootstrapping Variance Components
Introduction 

The application of variance component estimates stemming 
from analysis of variance (ANOVA) and other quadratic forms of 
linear models (Searle, 1971) to educational research settings has 
continued to increase in recent years. These variance component 
estimators provide important information about measurement 
parameters of interest (e.g. generalizability theory, Cronbach, 
Gleser, Rajaratnam and Nanda, 1972, Brennan, 1983) as well as 
experimental effect sizes and intraclass correlations. However, 
in either measurement or experimental applications, researchers 
usually require some confidence about the accuracy of the 
estimators obtained. 

Unfortunately, many of the attempts to define the 
distributions of variance components have involved tedious and 
complex algorithms and there remains to some extent a lack of 
consensus about the most appropriate distributional form to use 
(Searle, 1971, Smith, 1982). Nonetheless, the issue of the 
accuracy of variance component estimators can be dealt with, 
despite any controversy over distributional form, if normality 
and orthogonality of the data are assumed. That is, 
distributional properties of variance component estimators can be 
sought. The most common such property is the sampling variance 
of the variance components (Searle, 1971, Smith, 1978 and
Brennan, 1983), which can be directly extended to estimating
confidence intervals. However, as Smith (1982) demonstrated, even
under the assumptions of normality and orthogonality, estimation
of the sampling variances of variance components, and thus any
ensuing confidence intervals, may, at best, be marginal.

For this latter reason, more recent attempts to look at the 
distributional properties of variance components have considered 
the use of empirical confidence intervals (Brennan, Harris and 
Hanson, 1987, Smith, Luecht and Anderson, 1988). Under this 
approach, multiple samples are drawn and the desired confidence 
interval is determined directly from the percentiles (e.g. 0.05 
and 0.95) of the distribution of samples. However, in practice, 
the acquisition of multiple samples may not be feasible. 
Accordingly, researchers have needed to look at alternatives for
establishing empirical confidence intervals on variance
components.

One method of estimating empirical confidence intervals from 
single samples of data has been termed "bootstrapping". The 
general bootstrap technique described by Efron (1979, 1982) is a 
resampling approach to estimating confidence intervals upon
statistical parameters of interest. The technique involves 
rebuilding multiple data sets from a single sample data set. 
That is, an initial data set is resampled, with replacement, 
until a new data set is constructed, matching in size the 
original data set. The statistical parameter estimates of 
interest are computed and another data set is then drawn from the 
sample, again with replacement, and analyzed. This process of 
resampling continues until a large number of data sets have been 
constructed and analyzed. Empirical confidence intervals can 
then be directly estimated as percentile point equivalents in the 
derived distribution of resampled or reconstructed data sets. 
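
The resampling cycle just described can be sketched in a few lines. The following Python fragment is a minimal illustration, not code from the study: the function names, the 200 resamples, the 5th and 95th percentile points, the seeds and the simulated normal sample are all assumptions chosen for the example.

```python
import random
import statistics


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 100]) of an ascending list."""
    if not sorted_values:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def bootstrap_percentile_ci(sample, statistic, n_boot=200,
                            lower=5.0, upper=95.0, seed=0):
    """Return (lower, upper) percentile bounds of a bootstrapped statistic."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        # Rebuild a data set of the original size, sampling with replacement.
        resample = rng.choices(sample, k=n)
        estimates.append(statistic(resample))
    estimates.sort()
    return percentile(estimates, lower), percentile(estimates, upper)


# Hypothetical single sample of 100 standard normal observations.
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
low, high = bootstrap_percentile_ci(data, statistics.mean)
print(low, high)
```

Any statistic that accepts a list can be passed in place of `statistics.mean`; the variance component estimators considered in this paper require resampling whole factor levels rather than single observations.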

The bootstrap technique has been successfully implemented
for many applications (Chatterjee, 1984, Lunneborg and
Tousignant, 1985, Lunneborg, 1985, Iventosch, 1987); however, the
use of bootstrapping for obtaining confidence intervals on
variance components has met with only marginal success. Brennan,
Harris and Hanson (1987) looked at the bootstrap technique for
developing confidence intervals in measurement situations and
concluded that the method was somewhat ineffectual. It should be
noted, however, that Brennan et al. used a single replication
(data set), which may have limited their findings.

Smith, Luecht and Anderson (1988) extended the work of
Brennan et al. (1987) by investigating bootstrapping under three
orthogonal designs (a two factor crossed design, a three factor
crossed design and a three factor nested design). Using Monte
Carlo simulations involving many replications of data sets across
a variety of design sizes, Smith et al. were able to perform a
series of large-scale tests of the bootstrap methodology with
respect to confidence interval estimation and estimation of the
sampling variances of the variance components in their designs.

In general, the results obtained by Smith et al. (1988) were
somewhat less than favorable. The point estimates of the
variance components (mean and median values) tended to
overestimate the theoretical values (used to generate the data
sets) of the variance components for main effects and
underestimate any residual terms in their linear models. Since
the data used by Smith et al. were controlled to simulate
normality, both the parametric (use of the sampling variance) and
non-parametric (empirical) confidence intervals produced similar
results. However, those estimates were inconsistent, falling
both under and over the expected values of the
theoretical sampling variances.

These marginal findings would ordinarily suggest that the
bootstrap method holds little promise when applied to the problem
of variance component estimation. However, two key points were
alluded to but not specifically investigated in either study
(Brennan et al., 1987, Smith et al., 1988). The first point
concerns the size of the data set(s) being resampled under
bootstrapping. As Efron (1982) suggests, the variance of a
statistical parameter of interest should take the form

$$\hat{\sigma}^2_{(boot)} = \frac{n - 1}{n}\,\hat{\sigma}^2 \qquad (1)$$

Clearly, the size of the design under which the resampling
takes place will impact the underestimation of the total
variance ($\sigma^2_Y$). The variance components, as independent linear
parameters, for example,

$$\sigma^2_Y = \sigma^2_1 + \sigma^2_2 \qquad (2)$$

can be expected to likewise be restricted by any underestimation
of the total variance, $\sigma^2_Y$, in Equation (2).

Second, different strategies employed during the
bootstrapping can be envisioned to have differential effects on
the obtained variance components. For example, Smith et al.
(1988) simultaneously resampled all possible independent
parameters (i.e. main effect factors, a strategy similarly used
by Brennan et al., 1987). That is, Smith et al. did not look at
the potential of alternative bootstrapping strategies (e.g.
resampling only one factor in a design). Although Brennan et al.
did consider the issue of alternative bootstrapping strategies,
their use of a single data set may have precluded any positive
findings.

These two points therefore provide the primary objectives of
the present study. Under the assumptions of normality and
orthogonality of linear designs, this study (1) evaluates the
effect of design size on the estimation of variance components
and distributional estimators (parametric sampling variances and
non-parametric confidence intervals) and (2) seeks to provide
some understanding of various resampling strategies, ultimately
arriving at one "recommendable" strategy for resampling to
estimate variance components.

Preliminary to an empirical investigation of these
objectives, an initial attempt is made to describe the potential
or expected impact of bootstrapping strategies and sample sizes
upon variance component estimators. The next section describes
the derivations of the variance components for a two factor
random analysis of variance model, without replications, and
comments upon that expected impact as it relates to the
objectives. Following that section, methods and results are
provided for a two-part Monte Carlo study, meant to formally test
various bootstrapping strategies across a variety of sample
sizes.

Derivation of Variance Components under Bootstrapping

The present study considers a rather basic two factor random
effects analysis of variance (ANOVA) design, without
replications. This particular design was chosen because of its
generality to testing contexts (e.g. generalizability theory,
Cronbach et al., 1972, Brennan, 1983) and many experimental
settings involving repeated measures. Although more complex
crossed and nested designs were considered (see Smith et al.,
1988), there seemed to be no explicit reason to include them
here.

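Under this ANOVA design the two random effects factors, A
and B, with a and b levels respectively, have variance component
estimators as follows (Brennan, 1983):

$$\hat{\sigma}^2_A = (MS_A - MS_{AB,e}) / b \qquad (3)$$

$$\hat{\sigma}^2_B = (MS_B - MS_{AB,e}) / a \qquad (4)$$

Additionally, the residual variance confounds the interaction of
the A and B factors with the error term (since no cell replicates
are involved) in the form

$$\hat{\sigma}^2_{AB,e} = MS_{AB,e} \qquad (5)$$

Finally, the sampling variances of the estimators in Equations
(3), (4) and (5) can be expressed as

$$V(\hat{\sigma}^2_A) = \frac{2}{a - 1}\left[\left(\sigma^2_A + \frac{\sigma^2_{AB,e}}{b}\right)^2 + \frac{1}{b - 1}\left(\frac{\sigma^2_{AB,e}}{b}\right)^2\right] \qquad (6)$$

$$V(\hat{\sigma}^2_B) = \frac{2}{b - 1}\left[\left(\sigma^2_B + \frac{\sigma^2_{AB,e}}{a}\right)^2 + \frac{1}{a - 1}\left(\frac{\sigma^2_{AB,e}}{a}\right)^2\right] \qquad (7)$$

$$V(\hat{\sigma}^2_{AB,e}) = \frac{2\,(\sigma^2_{AB,e})^2}{(a - 1)(b - 1)} \qquad (8)$$

as suggested by Smith (1978).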
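
As a rough illustration of Equations (3) through (8), the sketch below computes the three estimators from an a x b table and evaluates the normal-theory sampling variances. The function names are hypothetical. The sampling variance expressions are written in the usual normal-theory form, in which the variance of a mean square equals twice its squared expectation divided by its degrees of freedom; that is an assumption about the exact layout of Smith's (1978) formulas.

```python
def variance_components(y):
    """Estimate (var_A, var_B, var_ABe) from an a x b table y[i][j].

    Rows are levels of factor A, columns are levels of factor B,
    with one observation per cell (no replications).
    """
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_means = [sum(row) / b for row in y]
    col_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]
    ms_a = b * sum((m - grand) ** 2 for m in row_means) / (a - 1)
    ms_b = a * sum((m - grand) ** 2 for m in col_means) / (b - 1)
    ss_res = sum((y[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(a) for j in range(b))
    ms_res = ss_res / ((a - 1) * (b - 1))
    # Equations (3), (4) and (5).
    return (ms_a - ms_res) / b, (ms_b - ms_res) / a, ms_res


def theoretical_sampling_variances(var_a, var_b, var_res, a, b):
    """Normal-theory sampling variances of the three estimators (assumed form)."""
    v_a = 2.0 / (a - 1) * ((var_a + var_res / b) ** 2
                           + (var_res / b) ** 2 / (b - 1))
    v_b = 2.0 / (b - 1) * ((var_b + var_res / a) ** 2
                           + (var_res / a) ** 2 / (a - 1))
    v_res = 2.0 * var_res ** 2 / ((a - 1) * (b - 1))
    return v_a, v_b, v_res


# A purely additive 3 x 2 table: the residual mean square is zero.
toy = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
print(variance_components(toy))
print(theoretical_sampling_variances(0.25, 0.25, 0.50, 100, 50))
```

With the theoretical components used later in this paper (0.25, 0.25 and 0.50) and a 100 x 50 design, the second function returns approximately 0.00137, 0.00265 and 0.00010, which agrees with the theoretical sampling variances reported in Table 5.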

Smith (1978) came to the conclusion that fairly large
numbers of levels of the involved factors were needed to
establish stable confidence intervals, based upon the sampling
variances (i.e. to reduce the standard errors of the variance
component estimators), even under normality assumptions.
Generally, for designs of this type, Smith recommended that $n_A n_B$
equal 800.

If that consideration of sample size is extended to
bootstrapping, then the reduction of the total variance under
bootstrapping (see Equation (1)) can be expected to further
confound the variance component estimators and their sampling
variances at different sample sizes. In short, the total
variance should be reduced for smaller sample sizes.

For example, if normality of the data in this A x B design
is assumed, then Equation (1) can be adapted to demonstrate the
impact of both sample size and bootstrapping strategy upon the
total variance, using degrees of freedom appropriate to this
ANOVA model (ab - 1), such that

$$\hat{\sigma}^2_{Y(boot)} = \frac{ab - 1}{ab}\,\hat{\sigma}^2_Y \qquad (9)$$

In other words, as $a \to \infty$ and $b \to \infty$, the bootstrapped
estimate of the total variance will approach the unbiased
estimator of the theoretical total variance, which itself will
approach the population value as the levels of a and b approach
infinity. Correspondingly, for fairly small levels of a and/or
b, there will be a reduced total variance estimate. For example,
if the model of this ANOVA design

$$\sigma^2_Y = \sigma^2_A + \sigma^2_B + \sigma^2_{AB,e} \qquad (10)$$

is considered, the impact of the underestimation of $\sigma^2_Y$ should
clearly extend to all variance components as well.

It is at this point that the issue of bootstrapping
strategies needs to be dealt with. If we assume a strategy of
simultaneous bootstrapping of both the A and B factors in (10)
then, as Brennan, Harris and Hanson (1987) demonstrate, the
estimator of $\sigma^2_A$, as a function of the average unbiased
covariances among the levels of factor B, becomes

$$\hat{\sigma}^2_{A(boot)} = \frac{1}{b(b-1)} \sum_{i} \sum_{i'} \left[ \frac{\sum_{j} (Y_{ji} - \bar{Y}_{\cdot i})(Y_{ji'} - \bar{Y}_{\cdot i'})}{a - 1} \right] \qquad (11)$$

(for i not equal to i') and the estimator of $\sigma^2_B$, as a function
of the covariances in factor A, likewise yields

$$\hat{\sigma}^2_{B(boot)} = \frac{1}{a(a-1)} \sum_{j} \sum_{j'} \left[ \frac{\sum_{i} (Y_{ji} - \bar{Y}_{j \cdot})(Y_{j'i} - \bar{Y}_{j' \cdot})}{b - 1} \right] \qquad (12)$$

(for j not equal to j'), where j indexes the levels of factor A
and i indexes the levels of factor B. As Brennan et al. further suggest,
these estimators imply an overestimation of the variance
components for factors A and B, especially where a and b are
small. If the residual is then obtained as a subtractive
function of the bootstrapped total variance in (9), $\hat{\sigma}^2_{Y(boot)}$,
less the overestimated $\hat{\sigma}^2_{A(boot)}$ and $\hat{\sigma}^2_{B(boot)}$ components, by
adaptively solving Equation (10) for $\hat{\sigma}^2_{AB,e(boot)}$, we therefore
see that any overestimation of the main effect variance
components will result in a proportional underestimation of the
residual term as a function of both the restriction on the total
variance and overestimation of the A and B factor components.

In fact, using this form of simultaneous bootstrapping,
Smith, Luecht and Anderson (1988) actually performed a large-scale
validation of this precise effect, with some restriction on
the limits of their design sizes (a 50 x 20 matrix was the
largest design considered). The expectation might be that the
overestimation in $\hat{\sigma}^2_{A(boot)}$ and $\hat{\sigma}^2_{B(boot)}$ (and the corresponding
underestimation in the residual term) would be further confounded
if the levels in factors A and B were highly disproportionate.
The present study seeks to re-examine that disproportionality
effect over a more varied range of sample sizes.

In contrast, consideration must also be given to a 
bootstrapping strategy which only resamples along a single 
factor. Under this strategy, levels of a single factor are 
resampled with the levels of the crossed factor automatically 
chosen for each selected level in the bootstrapped factor. For 
example, if only factor A is resampled, then for each level of A 
selected, all levels of factor B crossed at that level are 
automatically chosen. An approximation of the variance component
estimators as a function of the cross factor covariances could,
of course, be modeled as exemplified by Equations (11) and (12);
however, a more straightforward explanation appears warranted.

If the resampling occurs only in the levels of one factor,
for example, factor A for the model in (10), then the
expectations of the variance components (per Equation (1), Efron,
1982) should take the following form

$$\hat{\sigma}^2_{A(boot)} = \frac{a - 1}{a}\,\hat{\sigma}^2_A \qquad (13)$$

and

$$\hat{\sigma}^2_{AB,e(boot)} = \frac{a - 1}{a}\,\hat{\sigma}^2_{AB,e} \qquad (14)$$

where the estimator $\hat{\sigma}^2_{B(boot)}$ in Equation (12), as a function of
the average unbiased covariances across factor A, remains
unchanged. In other words, underestimation is constrained to a
single factor in the design, which ideally can be controlled by
increasing the number of levels, since for this example, as a
approaches $\infty$, $\hat{\sigma}^2_{A(boot)}$ likewise approaches the unbiased estimator
of $\sigma^2_A$. Since the covariances in factor A additionally determine
$\hat{\sigma}^2_{B(boot)}$, that same increase in the levels of factor A can be
expected to reduce the overestimation problem implicit in factor
B. Finally, the residual variance component estimator under this
approach to bootstrapping should be slightly underestimated, but
only to a degree proportional to the levels in factor A (i.e. for
fairly small resampling levels in A).
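
A small worked calculation may help fix the magnitudes implied by Equations (13) and (14). The sketch below assumes a shrinkage factor of (a - 1)/a, following the form of Equation (1), and uses the theoretical components adopted later in this study (0.25 for factor A and 0.50 for the residual); the function name is hypothetical.

```python
def expected_one_way_bootstrap(component, a):
    """Expected bootstrapped component when only the a levels of A are resampled."""
    return (a - 1) / a * component


for levels in (20, 150):
    exp_a = expected_one_way_bootstrap(0.25, levels)
    exp_res = expected_one_way_bootstrap(0.50, levels)
    print(levels, round(exp_a, 4), round(exp_res, 4))
```

Under these assumptions the expected values are 0.2375 and 0.4750 for a = 20, and about 0.2483 and 0.4967 for a = 150, so the anticipated shrinkage is already small once the bootstrapped factor has 150 levels. These figures can be compared with the single-factor means reported in Tables 1 to 4.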

The remainder of this paper deals with an empirical test of
the recommendations offered in this section and seeks to 
demonstrate single factor or "one-way" bootstrapping as a 
recommended strategy for estimating variance components and 
sampling variances, under the constraint of sample sizes 
appropriate for the resampling application. 

Methods 

Monte Carlo data sets were used to simulate the effects of
bootstrapping strategies and sample sizes, under the rationale
suggested in the prior section of this paper. A two-phase study
was implemented.

Phases of the investigation

In the first phase of this study, the issues of sample size
effects and adequacy of bootstrapping strategies were
simultaneously addressed. For this phase, multiple data sets
were generated under the A x B paradigm and each data set was
resampled (bootstrapped) 200 times. Data sets of four
different sizes were used: (1) a = 20 x b = 20; (2) a =
150 x b = 20; (3) a = 20 x b = 150; and (4) a = 150 x b = 150.

For each data set and its resampling cycle (bootstrapping
sequence), two strategies were implemented. First, each data set
was bootstrapped with respect to both the A and B factors,
effectively drawing a sample of each parameter matching the
original levels in the design. This method of resampling both
parameters (factors) of interest was suggested by Brennan et al.
(1987) and was the method of choice for Smith et al. (1988). At the
same time, only the resampled A-levels were used under the second
strategy. That is, all crossed levels of the B factor were
automatically chosen whenever a particular level of the A factor
was selected for the bootstrap sample. Since the sampling rates
varied rotationally across factors (by 20 and 150 levels), this
latter approach of holding resampling constant in the A factor
seemed sufficient to address the issue of resampling a single
parameter in contrast to resampling all parameters in a linear
model.
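
The two resampling strategies can be sketched as follows. This is an illustrative reading of the description above rather than the original program: the function names are hypothetical, and the data matrix is assumed to be stored as a list of rows, with rows as levels of factor A and columns as levels of factor B.

```python
import random


def resample_one_way(y, rng):
    """Single-factor strategy: draw levels of A with replacement.

    Every drawn row carries all of its crossed B cells with it.
    """
    a = len(y)
    return [list(y[rng.randrange(a)]) for _ in range(a)]


def resample_two_way(y, rng):
    """Simultaneous strategy: draw levels of A and of B with replacement."""
    a, b = len(y), len(y[0])
    rows = [rng.randrange(a) for _ in range(a)]
    cols = [rng.randrange(b) for _ in range(b)]
    return [[y[i][j] for j in cols] for i in rows]


# Hypothetical 2 x 3 table used only to show the shapes involved.
table = [[1, 2, 3], [4, 5, 6]]
rng = random.Random(0)
one_way = resample_one_way(table, rng)
two_way = resample_two_way(table, rng)
print(one_way)
print(two_way)
```

Both functions return a matrix of the original dimensions. Under the single-factor strategy every row of the result is an intact original row, whereas under the simultaneous strategy columns may also be duplicated or omitted.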

For each data set and its resampling sequence of 200
bootstraps, the point estimates of the average of the variance
components were saved and the sampling variance of those average
estimators was computed.

In the second phase of this study, the direct estimation of
confidence intervals was addressed. Here, the intent was to
demonstrate the adequacy of bootstrapping only one parameter in
the design (factor A), as suggested by the rationale and
approximations in the prior section. As previously noted, the
study by Smith et al. (1988) never tested this particular
bootstrapping strategy.

A total of 476 data sets having 100 levels of the A factor and 50
levels of the B factor were generated. Using 200 bootstraps for
each data set, the empirical confidence intervals (as well as the
average point estimates and sampling variances) were saved for
each replication.

Data Generation and Analysis Algorithms 

The Monte Carlo data sets in both phases of this study were
constructed to conform to a precise theoretical distribution. In
each case, random normal deviates were scaled to "known" values
of the variance components underlying the data. That is, each
data set had an a priori, theoretical set of constants set at $\sigma^2_A$
= 0.25, $\sigma^2_B$ = 0.25 and $\sigma^2_{AB,e}$ = 0.50, such that $\sigma^2_Y$ equaled 1,
with known contributions of the component variances, as stated.
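
One way to realise this scaling is sketched below. It assumes an additive random effects model with a zero grand mean, in which independent normal deviates are drawn with standard deviations equal to the square roots of the theoretical components. The paper does not give the generating algorithm, so the function name, the seed handling and the model layout are illustrative assumptions.

```python
import random


def generate_data_set(a, b, var_a=0.25, var_b=0.25, var_res=0.50, seed=0):
    """Generate an a x b table with known theoretical variance components."""
    rng = random.Random(seed)
    # Random effects for the levels of A and B, scaled to the target variances.
    alpha = [rng.gauss(0.0, var_a ** 0.5) for _ in range(a)]
    beta = [rng.gauss(0.0, var_b ** 0.5) for _ in range(b)]
    # One residual (interaction confounded with error) deviate per cell.
    return [[alpha[i] + beta[j] + rng.gauss(0.0, var_res ** 0.5)
             for j in range(b)]
            for i in range(a)]


sample_table = generate_data_set(20, 20, seed=42)
print(len(sample_table), len(sample_table[0]))
```

Passing a different seed for each replication would produce the independent data sets needed for a Monte Carlo run.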

The generation of the data sets, bootstrapping, analysis of
variance and estimation of variance components and sampling
variances were programmed and run on an IBM-AT compatible
microcomputer with math co-processing capabilities accurate to 17
or 18 decimal places. The computational algorithms were further
validated against results obtained via the Systat 4.0 MGLH module
(Systat, 1988), using smaller data sets.

Results and Conclusions 

The first phase of this investigation was meant to model the
expectations of the $\hat{\sigma}^2$ estimators under bootstrapping by contrasting
two resampling strategies across a variety of design sizes: (1)
two-way resampling of both the A and B factors and (2) resampling
of only the levels of factor A. In the second phase of analysis,
the latter method of bootstrapping (resampling a single factor,
here factor A) was re-investigated in terms of adequacy in
establishing empirical (non-parametric) confidence intervals for
fairly large data sets (100 x 50). A comparison of
distributional parameters and confidence intervals estimated via
those parameters was also incorporated into this phase of the
study.

Phase I: Comparison of Bootstrap Strategies

Tables 1, 2, 3 and 4 present the results from this
bootstrapping comparison phase, representing data sets of 20 x
20, 150 x 20, 20 x 150 and 150 x 150, respectively. Descriptive
statistics for each variance component under single-factor
bootstrapping are shown in the leftmost three columns of values
for each table. The rightmost three columns of values in each
table display the results from simultaneous or two-way
bootstrapping.

Simultaneous (Two-Way) Bootstrapping 

Tables 1-4 demonstrate a consistent overestimation of the
factor A and B variance component point estimators (mean and
median values across data sets), as described by Equations (11)
and (12). Likewise, the anticipated underestimation of the
residual terms in those tables is clearly evident. However, one
anomaly surfaces which was alluded to earlier. Most noticeable
in Tables 2 and 3, when there is disproportionality between the
numbers of levels of the A and B factors, the overestimation
problem favors the factor having more levels, and the
smaller crossed factor may even be underestimated.

Under that same condition of disproportionality, the
estimators of the sampling variances, parametric confidence
intervals and non-parametric confidence intervals also appear to
fluctuate dramatically from expectation. It therefore seems
apparent that simultaneous bootstrapping (two-way resampling for
A x B designs) can be applied, but only under two highly
restrictive constraints. First, the levels of the resampled
factors must be approximately equivalent. Second, fairly large
data matrices (e.g. 50 x 50 or greater) would be required to (a)
reduce the overestimation of the main effects variance components
and (b) bring the residual estimators up to a reasonable level.

For more complex designs (see Smith, Luecht and Anderson,
1988), the overestimation and underestimation problems would
appear to create even more restrictions of concern to
researchers (e.g. a 50 x 50 x 50 data matrix for a three-way
crossed design is hardly practical to obtain in most experimental
or even testing contexts).

For these reasons, and in keeping with the recommendations made
by Brennan et al. (1987) and Smith et al. (1988), it seems
reasonable to discount simultaneous bootstrapping as a viable
approach to obtaining usable confidence intervals or
distributional estimators as a general application.

Single-Factor Bootstrapping

The point estimators (means and medians) in Tables 1 to 4,
under single factor bootstrapping, appear consistent with the
expectations derived in Equations (12), (13) and (14). That is,
there is a tendency to underestimate the bootstrapped factor
(factor A) and overestimate the non-bootstrapped factor (factor
B). Also, especially for designs having small numbers of levels
on the bootstrapped factor (see Table 1 and Table 3), the
residual term, $\hat{\sigma}^2_{AB,e(boot)}$, tends toward very slight
underestimation. It should be noted that the mild apparent
overestimation of the residuals in Tables 2 and 4 is, technically
speaking, not overestimation at all. Rather, consistent with the
theory of bootstrapping (Efron, 1982), the residual estimators
will approach the sample estimator as $a \to \infty$ and $b \to \infty$. Of
course, the sample estimator, $\hat{\sigma}^2_{AB,e}$, will itself approach the
theoretical population residual as the levels of factors in the
design approach infinity (Searle, 1971).

The sampling variances (theoretical, expected and estimated)
also demonstrate a close correspondence. Likewise, under
single-factor bootstrapping, both the non-parametric (empirical) and
parametric confidence intervals provide similar information.

Assuming, as we have done from the outset, that the purpose
behind bootstrapping is to estimate empirical confidence
intervals, it therefore seems appropriate to conclude that
single-factor bootstrapping appears to succeed where simultaneous
bootstrapping could not. That is, bootstrapping only one factor
in a design should provide reasonable estimation of the variance
component parameters and confidence intervals around those
parameters, under the constraint of having sufficient levels of
the bootstrapped factor. For example, the greater overestimation
of the variance component for factor B in the 20 x 150 design (Table
3) is not seen in the 150 x 20 design (Table 2). That is, by
increasing the number of levels on the single bootstrapped factor,
estimation bias for all factors appears to be reasonably
controlled. Second, the underestimation effect on the residual
term seems minimized under single-factor resampling. In contrast 
to the even more dramatic underestimation of residuals discovered 
by Smith et al. (1988) for more complex designs, it appears that 
single-factor bootstrapping will provide consistent and 
reasonable estimators, provided the resampled factor has 
sufficient numbers of levels. 

Phase II: Analysis of the Single-Factor Bootstrap Strategy

Table 5 presents the results from the Phase II analysis of
476 data sets (using the A x B random model), with a equaling 100
levels and b equaling 50. Beyond point estimators and expected
sampling variances of the estimators, an additional feature was
added during this analysis phase. That feature was the
calculation of the lower and upper bound percentile estimates and
variance estimates for the components for each bootstrapping
sequence of 200 iterations, across all data sets.

Accordingly, it becomes possible to compare four different 
sets of confidence intervals on the estimators: (1) the 
parametric 95% confidence intervals, using the actual sampling 
variance estimate of the components, (2) the non-parametric 
confidence intervals (percentile points set at 2.5% and 97.5%) 
for the data set point estimators, (3) the mean value of the
lower and upper percentile points from each bootstrapped data set
and (4) the median value of the lower and upper percentile points
from each bootstrapped data set.
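
The first of these intervals can be illustrated with a short calculation. The sketch below assumes that the parametric bounds are formed as the point estimate plus or minus 1.96 standard errors; the multiplier is not stated in the text and is an assumption, as is the function name. The inputs are the mean estimators and variances of estimators reported in Table 5 for factors A and B.

```python
import math


def parametric_ci(point_estimate, sampling_variance, z=1.96):
    """Normal-theory interval: estimate +/- z * sqrt(sampling variance)."""
    half_width = z * math.sqrt(sampling_variance)
    return point_estimate - half_width, point_estimate + half_width


# Mean estimator and variance of estimators from Table 5.
lower_a, upper_a = parametric_ci(0.25267, 0.00200)  # factor A
lower_b, upper_b = parametric_ci(0.26257, 0.00376)  # factor B
print(round(lower_a, 4), round(upper_a, 4))
print(round(lower_b, 4), round(upper_b, 4))
```

Under this assumption the computed bounds are close to the parametric lower and upper bounds listed in Table 5 (0.16502 and 0.34032 for factor A, 0.14239 and 0.38275 for factor B), which suggests, without confirming, that the table was built in this way.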

It is quite clear that, in general, the lower and upper
bound estimators provide similar information, forming fairly
symmetrical intervals around the theoretical values of the
components. However, one interesting condition arises when
considering the mean and median lower and upper estimators of
$\hat{\sigma}^2_{B(boot)}$ across data sets. The sampling variances (see mean
and median estimated sample variances in Table 5) derived for
each bootstrapped sample are noticeably smaller than expectation,
as also evidenced by the restricted limits on the interval. While
none of the other sampling variance estimators for $\hat{\sigma}^2_{B(boot)}$
suggest such lesser variation, the reduction is nonetheless quite
distinct, and seems to occur only at the level of each
bootstrapped data set. By way of explanation, it should be
realized that the sampling variance $\mathrm{Var}(\hat{\sigma}^2_{B(boot)})$ is actually a
variance of a covariance function (see Equation (12)). Since the
sampling variance form of that covariance is unknown, but can be
expected to be a restricted form (i.e. subject to the levels
actually resampled in factor A), a smaller amount of variation
seems logical. Although the interval on the $\hat{\sigma}^2_{B(boot)}$ estimators
is "tighter", it is still reasonably symmetrical and captures all
pertinent point estimators; this anomaly nonetheless suggests a
biased estimation problem. That is, the statistic may be useful
for estimating the covariance due to bootstrapping, but should be
treated cautiously as an estimator of the actual sampling
variance of $\hat{\sigma}^2_B$. Furthermore, any confidence intervals derived
for factor B may be misleading. It may be possible to overcome
this sampling variance problem by independently bootstrapping
only factor B in a secondary analysis; however, further research
may be warranted on that count.

As a final note on the results, it should be noticed that
the point estimators (means and medians), while consistent with
the earlier theoretical derivations in Equations (12), (13) and
(14) and, in terms of over- and underestimation, with the Phase I
results, strongly suggest the degree of estimation control which can be
accomplished by increasing the numbers of levels in the resampled
factor.

It therefore seems reasonable to suggest that single-factor
bootstrapping presents a reasonable alternative for estimating
variance components, under the constraint of adequate sampling
(at least for the bootstrapped factor and residual). Of course,
discovery of an "optimal" number of levels to resample remains a
question for additional study. Also, the applicability of this
technique to more complex designs or nonorthogonal designs, or
merely non-normal data, requires further work.

Table 1

A Comparison of Bootstrapping Strategies for
a = 20 and b = 20

(No. of Data Sets = 481)

                                              Single Factor Bootstrapping   Bootstrapping All Factors
                                                      A        B     AB,e           A        B     AB,e

Value of Theoretical Variance Component          0.2500   0.2500   0.5000      0.2500   0.2500   0.5000
Variance of Theoretical Component                0.0080   0.0080   0.0014      0.0080   0.0080   0.0014
Mean Estimator of Component                      0.2361   0.2865   0.4795      0.2708   0.2877   0.4543
Median Estimator of Component                    0.2307   0.2752   0.4972      0.2607   0.2756   0.4527
Actual Variance of Component Estimators          0.0070   0.0091   0.0014      0.0082   0.0090   0.0013
Mean of Expected Sampling Variances              0.0079   0.0111   0.0013      0.0093   0.0111   0.0012
Median of Expected Sampling Variances            0.0068   0.0094   0.0013      0.0085   0.0094   0.0013
Parametric Lower Bound Estimator (95%)           0.0721   0.0995   0.4062      0.0933   0.1018   0.3836
Parametric Upper Bound Estimator (95%)           0.4001   0.4735   0.5528      0.4483   0.4736   0.5250
Non-parametric Lower Bound Estimator (2.5%)      0.0952   0.1341   0.4024      0.1217   0.1363   0.3819
Non-parametric Upper Bound Estimator (97.5%)     0.4192   0.4925   0.5513      0.4577   0.4925   0.5294






Table 2

A Comparison of Bootstrapping Strategies for
a = 150 and b = 20

(No. of Data Sets = 416)

                                              Single Factor Bootstrapping   Bootstrapping All Factors
                                                      A        B     AB,e           A        B     AB,e

Value of Theoretical Variance Component          0.2500   0.2500   0.5000      0.2500   0.2500   0.5000
Variance of Theoretical Component                0.0005   0.0068   0.0002      0.0005   0.0068   0.0002
Mean Estimator of Component                      0.2514   0.2455   0.5029      0.3526   0.2429   0.4777
Median Estimator of Component                    0.2487   0.2410   0.5042      0.2760   0.2279   0.4791
Actual Variance of Component Estimators          0.0012   0.0084   0.0007      0.1402   0.0078   0.0006
Mean of Expected Sampling Variances              0.001?   0.0079   0.0002      0.0??8   0.0072   0.0002
Median of Expected Sampling Variances            0.0010   0.0063   0.0002      0.0012   0.00??   0.0001
Parametric Lower Bound Estimator (95%)           0.1835   0.0658   0.4510     -0.3810   0.0698   0.4297
Parametric Upper Bound Estimator (95%)           0.3192   0.4251   0.5547      1.0865   0.4160   0.5257
Non-parametric Lower Bound Estimator (2.5%)      0.1930   0.1205   0.4771      0.2195   0.1140   0.4523
Non-parametric Upper Bound Estimator (97.5%)     0.3164   0.4461   0.5329      1.7959   0.4530   0.5064

Table 3

A Comparison of Bootstrapping Strategies for
a = 20 and b = 150

(No. of Data Sets = 553)

                                              Single Factor Bootstrapping   Bootstrapping All Factors
                                                      A        B     AB,e           A        B     AB,e

Value of Theoretical Variance Component          0.2500   0.2500   0.5000      0.2500   0.2500   0.5000
Variance of Theoretical Component                0.0068   0.0005   0.0002      0.0068   0.0005   0.0002
Mean Estimator of Component                      0.2491   0.2807   0.4807      0.2452   0.2788   0.4775
Median Estimator of Component                    0.2343   0.2806   0.4813      0.2375   0.2783   0.4776
Actual Variance of Component Estimators          0.0068   0.0011   0.0001      0.0068   0.0011   0.0001
Mean of Expected Sampling Variances              0.0070   0.0013   0.0002      0.0072   0.0012   0.0002
Median of Expected Sampling Variances            0.0059   0.0012   0.0002      0.0061   0.0012   0.0002
Parametric Lower Bound Estimator (95%)           0.0874   0.2157   0.4530      0.0836   0.2138   0.4579
Parametric Upper Bound Estimator (95%)           0.4107   0.3457   0.5084      0.4068   0.3481   0.4971
Non-parametric Lower Bound Estimator (2.5%)      0.1102   0.2160   0.4575      0.1129   0.2144   0.4550
Non-parametric Upper Bound Estimator (97.5%)     0.4091   0.3461   0.5040      0.4143   0.3461   0.5002

Table 4

A Comparison of Bootstrapping Strategies for
a = 150 and b = 150

(No. of Data Sets = 117)

                                              Single Factor Bootstrapping   Bootstrapping All Factors
                                                      A        B     AB,e           A        B     AB,e

Value of Theoretical Variance Component          0.2500   0.2500   0.5000      0.2500   0.2500   0.5000
Variance of Theoretical Component                0.0009   0.0009  0.0000*      0.0009   0.0009  0.0000*
Mean Estimator of Component                      0.2481   0.2568   0.5030      0.2524   0.2545   0.4987
Median Estimator of Component                    0.2467   0.2560   0.5041      0.2504   0.2548   0.4995
Actual Variance of Component Estimators          0.0012   0.0010   0.0001      0.0012   0.0010   0.0001
Mean of Expected Sampling Variances              0.0009   0.0009  0.0000*      0.0009   0.0009  0.0000*
Median of Expected Sampling Variances            0.0008   0.0009  0.0000*      0.0009   0.0009  0.0000*
Parametric Lower Bound Estimator (95%)           0.1893   0.1980   0.4932      0.1936   0.1957   0.4889
Parametric Upper Bound Estimator (95%)           0.3069   0.3156   0.5128      0.3112   0.3133   0.5085
Non-parametric Lower Bound Estimator (2.5%)      0.1882   0.2087   0.4922      0.1923   0.2055   0.4888
Non-parametric Upper Bound Estimator (97.5%)     0.3060   0.3250   0.5116      0.3080   0.3184   0.5074

* <0.00005

Table 5

A Comparison of Confidence Interval Estimators
under Single-Factor Bootstrapping for
a = 100 and b = 50

(No. of Data Sets = 476)

                                             A          B       AB,e

Theoretical Components                 0.25000    0.25000    0.50000
Theoretical Sampling Variances         0.00137    0.00265    0.00010
Mean Estimators                        0.25267    0.26257    0.50043
Median Estimators                      0.24870    0.25560    0.50100
Variance of Estimators                 0.00200    0.00376    0.00027
Skewness of Estimators                 0.46947    0.54686   -0.49107
Kurtosis of Estimators                 0.15032    0.07724    3.03976
Mean of Exp. Sampling Variances        0.00143    0.00348    0.00010
Median of Exp. Sampling Variances      0.00135    0.00278    0.00010
Mean of Est. Sample Variances          0.00187    0.00015    0.00016
Median of Est. Sample Variances        0.00157    0.00013    0.00014
Parametric Lower Bound (95%)           0.16502    0.14239    0.46822
Parametric Upper Bound (95%)           0.34032    0.38275    0.53264
Lower Bound of Estimators (2.5%)       0.17510    0.16440    0.46250
Upper Bound of Estimators (97.5%)      0.35550    0.40150    0.53150
Mean Boot Lower Bound (2.5%)           0.17645    0.24006    0.47576
Mean Boot Upper Bound (97.5%)          0.33687    0.28577    0.52370
Median Boot Lower Bound (2.5%)         0.17150    0.23370    0.47740
Median Boot Upper Bound (97.5%)        0.33300    0.27820    0.52300